Chicon - A Chinese Text Manipulation Language
Abstract
Text processing is an important computer application, and a number of text manipulation programming languages, such as Icon, have been devised to support it. These languages are very useful for applications such as natural language processing, text analysis, text editing, document formatting, and text generation. However, they were designed mainly to handle English texts and are ineffective for Chinese, because English and Chinese texts are represented very differently in a computer. An English character is mainly represented in 7-bit ASCII, whereas a Chinese character is commonly represented in 16-bit GB or BIG-5. Because of this difference, applying English-based text manipulation languages directly to Chinese produces erroneous results, e.g. when Icon is used to reverse a string of Chinese characters. In this paper, a new dialect of Icon, referred to as Chicon (i.e. Chinese Icon), is proposed. In the design of Chicon, new data types were introduced to differentiate pure English texts from mixed English/Chinese texts. In addition, existing Icon text manipulation functions were modified to account for Chinese texts. Experiments have shown that Chicon not only overcomes the problems of Chinese processing in Icon, but also executes faster than Icon when handling Chinese. Furthermore, applying Chicon to a realistically sized problem, namely word segmentation, has shown that the language is practical.
Journal: Softw., Pract. Exper.
Volume 28
Publication date: 1998